Impact of HIV-1 genetic diversity on disease progression: a prospective cohort study in Guangxi

The high proportion of AIDS cases and the high mortality in Guangxi underscore the urgency of investigating the influence of HIV-1 genetic diversity on disease progression in this region. Newly diagnosed HIV-1 patients were enrolled from January 2016 to December 2021, and follow-up with CD4+T lymphocyte testing was carried out every six months until December 2022. Multivariate logistic regression was used to analyze the factors affecting pre-treatment CD4+T lymphocyte counts, while locally weighted regression (LOESS) and generalized estimating equation (GEE) models were used to assess factors influencing CD4+T lymphocyte recovery. Cox regression analysis was used to examine the impact of subtypes on survival risk. Additionally, HIV-1 env sequences were used to predict CXCR4 and CCR5 coreceptor usage. The study encompassed 1867 individuals with pol sequences and 281 with env sequences. Our findings indicate that age over 30, divorced/widowed status, being a peasant, heterosexual infection, CRF01_AE, long-term infection, and pre-treatment viral load >10000 copies/ml were associated with a higher risk of pre-treatment CD4+T lymphocyte decline. Male gender, age over 30, heterosexual infection (HET), long-term infection, CRF01_AE, and pre-treatment CD4+T cell counts below 350/µL were identified as risk factors impeding CD4+T lymphocyte recovery. Pre-treatment CD4+T lymphocyte counts and recovery were lower in individuals infected with CRF01_AE than in those infected with CRF07_BC and CRF55_01B. Additionally, the CRF01_AE and CRF08_BC subtypes exhibited higher mortality rates than CRF07_BC, CRF55_01B, and other subtypes. Notably, CRF01_AE showed the highest CXCR4 affinity ratio. This research reveals the intricate influence of HIV-1 genetic diversity on CD4+T lymphocyte dynamics and clinical outcomes. It highlights the multifaceted nature of HIV infection in Guangxi, providing novel insights into subtype-specific disease progression among HIV-infected individuals in this region.


Introduction
HIV-1 exhibits remarkable genetic diversity, leading to distinctive coreceptor utilization and varying disease progression among its diverse subtypes and circulating strains (Poveda et al., 2006). Research underscores the differing clinical presentations and immune system impairments associated with various HIV-1 subtypes post-infection. In particular, individuals infected with subtype D in African regions exhibit higher mortality risks and are more prone to advancing to the AIDS stage than those with subtype A (Vasan et al., 2006; Baeten et al., 2007; Keller et al., 2009). Studies in the United Kingdom further affirm the accelerated disease progression in subtype D-infected individuals relative to other subtypes (Easterbrook et al., 2010). Investigations focusing on the CRF01_AE subtype highlight rapid progression from HIV-1 infection to the clinical AIDS stage and to a CD4+T cell count below 200/µL, with median times of 7.2 years and 6.5 years, respectively (Rangsin et al., 2007). Moreover, studies have noted a higher prevalence of CXCR4 tropism within CRF01_AE, correlating with rapid disease progression (Utaipat et al., 2002). Additionally, substantial variations exist among distinct CRF01_AE epidemic clusters (Song et al., 2019). Similarly, among patients with the CRF02_AG subtype, studies have reported an 86% rate of CXCR4 tropism, associated with rapid disease progression (Esbjornsson et al., 2010).
Guangxi, a southwestern Chinese province bordering Vietnam, faces a severe HIV-1 epidemic, with a relatively high proportion of newly diagnosed HIV-1-infected individuals already at the AIDS stage at diagnosis (Chen et al., 2019; Sun et al., 2020). However, it remains unclear whether the increased proportion of AIDS patients arises from late detection or from other contributing factors. Currently, prevalent HIV-1 subtypes in Guangxi include CRF01_AE, CRF07_BC, CRF08_BC, and CRF55_01B. The potential association between rapid disease progression and HIV-1 subtypes necessitates further investigation. This prospective study enrolled newly diagnosed HIV-1-infected individuals to investigate the impact of subtypes on CD4+T lymphocyte count and mortality.

Study participants and sample collection
Between January 2016 and December 2021, participants were recruited from Voluntary Counseling and Testing (VCT) centers and non-governmental organizations (NGOs) in Guangxi. Inclusion criteria were as follows: 1) newly diagnosed with HIV-1; 2) not yet initiated on antiretroviral therapy (ART); 3) aged ≥ 18 years and provided informed consent. Peripheral blood samples and epidemiological data were collected. Plasma was isolated within 12 hours of collection and stored at -80°C for subsequent sequencing.

Follow-up and data collection
Upon HIV-1 diagnosis, a demographic questionnaire was administered, followed by sample collection, pre-treatment CD4+T lymphocyte and HIV-1 viral load testing, and initiation of antiviral therapy. Follow-up took place every six months until December 2022 and involved sample collection for CD4+T lymphocyte testing.

Sequence processing and HIV-1 subtyping
Sequences were edited using Sequencher v5.1 software (Genecodes, Ann Arbor, MI) and aligned using BioEdit 7.1 software (Ibis Biosciences, Carlsbad, CA, USA). Subtyping utilized 117 reference sequences encompassing all Chinese subtypes from the Los Alamos HIV database. Phylogenetic trees were constructed using MEGA 11.0 software employing the neighbor-joining method for subtype identification.
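As an illustration of the distance information that a neighbor-joining tree is built from, the following minimal sketch computes pairwise p-distances between aligned sequences and reports the nearest reference. It is not the MEGA workflow used in the study and does not build a tree; the function names, the pairwise-deletion handling of gaps, and the toy sequences are all assumptions made for the example.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (pairwise deletion); this handling is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def closest_reference(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Return (label, distance) of the reference nearest to the query."""
    label = min(references, key=lambda name: p_distance(query, references[name]))
    return label, p_distance(query, references[label])


# Toy aligned fragments; labels and sequences are invented for illustration.
toy_refs = {
    "ref_A": "ATGGCTAGCTAGGATCCA",
    "ref_B": "ATGACTAGTTAGGCTCCA",
}
toy_query = "ATGACTAGTTAGGATCCA"
print(closest_reference(toy_query, toy_refs))
```

A nearest-reference lookup is only a rough screen; recombinant forms in particular need a full tree, and usually bootstrap support, before a subtype is assigned.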

Mutation analysis
Amino acid sequences of the V3 region underwent mutation analysis. Sequences were saved in FASTA format and analyzed using the WebLogo website (http://weblogo.berkeley.edu/logo.cgi).
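A sequence logo summarizes, for each alignment column, how often each residue occurs. The sketch below computes those per-position frequencies directly; it is a simplified stand-in for what WebLogo displays (it omits the information-content scaling), and the function name and toy fragments are assumptions for illustration only.

```python
from collections import Counter


def position_frequencies(aligned_seqs: list[str]) -> list[dict[str, float]]:
    """Per-column residue frequencies for equal-length aligned sequences."""
    if not aligned_seqs:
        raise ValueError("no sequences given")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        profile.append({aa: n / len(column) for aa, n in counts.items()})
    return profile


# Toy six-residue fragments, invented for illustration (not study data).
toy_fragments = ["NTRKSI", "NTRKRI", "NTRKSI", "NTRTSI"]
toy_profile = position_frequencies(toy_fragments)
for position, freqs in enumerate(toy_profile, start=1):
    print(position, freqs)
```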

Statistical analysis
Demographic information was summarized as frequencies and percentages. The Mann-Whitney U test was used to compare pre-treatment CD4+T lymphocyte counts among different subtypes. Chi-square tests were used to assess the pre-treatment CD4+T lymphocyte distribution. Logistic regression was used to analyze factors influencing pre-treatment CD4+T lymphocytes, while post-treatment CD4+T lymphocyte recovery was analyzed with LOESS and GEE models. Statistical significance was set at p < 0.05. IBM SPSS 26 was used for statistical analysis, Python for GEE analysis, and GraphPad Prism 9 for visualization.
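To make the rank-based comparison concrete, the sketch below computes the Mann-Whitney U statistics for two groups, using average ranks for ties. It stops at the statistic and does not compute a p-value; the study itself used SPSS, and the function names and CD4 values here are hypothetical.

```python
def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (U_x, U_y); the two statistics always sum to len(x) * len(y)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    return u_x, n1 * n2 - u_x


# Hypothetical pre-treatment CD4 counts (cells/µL) for two groups.
group_a = [180, 220, 250, 310]
group_b = [260, 330, 400, 450]
print(mann_whitney_u(group_a, group_b))
```

A small U for one group means its values mostly rank below those of the other group. Significance would then be read from the exact distribution or a normal approximation.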

Characterization of the study population
A total of 1867 individuals were included in the study: 956 men who have sex with men (MSM), 836 infected through heterosexual contact (HET), and 75 infected through other routes. The study population included more HIV-1-infected men (77.03%) than women (22.97%). This distribution reflects the actual epidemiological patterns in Guangxi, where HIV-1 prevalence is significantly higher among men. The prevalent HIV-1 subtypes were CRF07_BC (36.48%), CRF01_AE (35.67%), CRF08_BC (12.32%), CRF55_01B (8.25%), and other (7.28%) (Supplementary Figure 1). Statistically significant differences were observed among subtypes in gender, age, marital status, education, occupation, infection route, infection time, pre-treatment CD4+T cell counts, pre-treatment viral load, clinical stage, and drug resistance before treatment (p < 0.05) (Table 1).

Factors influencing pre-treatment CD4+T lymphocytes and CD4+T lymphocyte recovery
Multivariate logistic regression was used to analyze the risk factors affecting pre-treatment CD4+T lymphocytes. Compared with the reference categories (age below 30, unmarried, unemployed, other infection routes, the CRF07_BC and CRF08_BC subtypes, recent infection, and pre-treatment viral load <10000 copies/ml), the factors significantly associated with a higher risk of pre-treatment CD4+T lymphocyte decline were age over 30, divorced/widowed status, being a peasant, heterosexual infection, the CRF01_AE subtype, long-term infection, and pre-treatment viral load >10000 copies/ml (Supplementary Table 1).
Generalized estimating equation (GEE) analysis was used to investigate factors influencing CD4+T lymphocyte recovery. The analysis revealed that infection time, HIV-1 subtype, infection route, gender, age, pre-treatment CD4+T lymphocyte count, clinical stage, treatment regimen, time to treatment initiation, and treatment duration significantly affected post-treatment CD4+T lymphocyte recovery. Specifically, male gender, age over 30, heterosexual infection, long-term infection, the CRF01_AE subtype, pre-treatment CD4+T cell counts < 350/µL, clinical stages III and IV, and a longer time to treatment initiation were identified as risk factors impeding CD4+T lymphocyte recovery. Interestingly, 3TC+AZT+LPV/r and 3TC+LPV/r+TDF were more conducive to CD4+T cell recovery than 3TC+EFV+TDF, 3TC+AZT+EFV, and other regimens. Longer treatment duration was beneficial for CD4+T cell recovery (Table 2).
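For intuition about the effect sizes such a model reports, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. This is a univariate simplification, not the multivariate model fitted in the study; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    a, b, c, d = exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: low pre-treatment CD4 (cases) by an exposure of interest.
toy_or, toy_lower, toy_upper = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {toy_or:.2f}, 95% CI {toy_lower:.2f} to {toy_upper:.2f}")
```

A multivariate model adjusts each odds ratio for the other covariates, so its estimates generally differ from unadjusted values such as this one.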
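For reference, the standard form of a GEE for repeated CD4+T measurements can be written as below. The notation is an assumption introduced here: Y_ij is the j-th measurement for individual i, x_ij its covariates, g the link function, A_i the diagonal matrix of variance functions, R(α) the working correlation matrix, and φ a dispersion parameter. The link function and working correlation structure used in the study are not stated in the text.

```latex
\begin{aligned}
g(\mu_{ij}) &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta},
  \qquad \mu_{ij} = \mathbb{E}\left[Y_{ij} \mid \mathbf{x}_{ij}\right], \\
\sum_{i=1}^{N} \mathbf{D}_i^{\top}\mathbf{V}_i^{-1}
  \left(\mathbf{Y}_i - \boldsymbol{\mu}_i\right) &= \mathbf{0},
  \qquad \mathbf{D}_i = \frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}^{\top}}, \\
\mathbf{V}_i &= \phi\,\mathbf{A}_i^{1/2}\,\mathbf{R}(\alpha)\,\mathbf{A}_i^{1/2}.
\end{aligned}
```

Because the estimating equation only requires the mean model to be correct, the coefficient estimates remain consistent even if the working correlation is misspecified, which suits irregular six-monthly follow-up data.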

Effects of subtypes on pre-treatment CD4+T lymphocytes and CD4+T lymphocyte recovery
A comparative analysis was conducted to evaluate the impact of various HIV-1 subtypes on pre-treatment CD4+T lymphocytes. The results revealed significantly lower absolute CD4+T lymphocyte counts among individuals infected with the CRF01_AE subtype than among those infected with other subtypes. Conversely, those infected with the CRF07_BC subtype had notably higher CD4+T lymphocyte counts than those infected with CRF08_BC and CRF55_01B (Figure 1A). This trend persisted at the subtype level, where the MSM population had significantly higher pre-treatment CD4+T lymphocyte counts than the HET population across the CRF01_AE, CRF07_BC, and CRF08_BC subtypes (p < 0.001), with no notable difference in the CRF55_01B subtype (Figure 1B).
The LOESS model was used to fit the recovery trend of CD4+T cells after infection with different subtypes. Plotted data included scatter plots and 95% confidence intervals (CI). The fitted curves for the CRF07_BC and CRF55_01B subtypes were higher than those for the CRF01_AE and CRF08_BC subtypes (Figure 2). We further analyzed the recovery trend of CD4+T cells after infection with different subtypes in the MSM and HET populations. In the MSM population, CRF07_BC infection showed more pronounced recovery, while CRF01_AE infection showed comparatively slower recovery. In the HET population, CRF07_BC infection showed the most favorable recovery, followed by CRF08_BC infection, whereas CRF01_AE infection showed the least favorable recovery. Intriguingly, recovery for CRF01_AE and CRF07_BC within the MSM population surpassed that observed within the HET population (Supplementary Figure 2).
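The idea behind a LOESS curve can be sketched in a few lines: at each time point, fit a weighted straight line to the nearest observations, with weights that fall off with distance. The version below is a minimal illustration with tricube weights and no robustness iterations or confidence bands; the function names, the `frac` window parameter, and the follow-up values are assumptions, not the settings used for Figure 2.

```python
import math


def tricube(u: float) -> float:
    """Tricube kernel: weight 1 at distance 0, falling to 0 at distance 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit evaluated at x0."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    k = min(n, max(2, math.ceil(frac * n)))  # number of points in the window
    bandwidth = sorted(abs(x - x0) for x in xs)[k - 1]
    if bandwidth == 0.0:  # every window point coincides with x0
        same = [y for x, y in zip(xs, ys) if x == x0]
        return sum(same) / len(same)
    weights = [tricube((x - x0) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:  # all window points sit exactly on the window edge
        near = [y for x, y in zip(xs, ys) if abs(x - x0) <= bandwidth]
        return sum(near) / len(near)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0.0:  # a slope cannot be estimated; use the weighted mean
        return mean_y
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    return mean_y + (sxy / sxx) * (x0 - mean_x)


# Hypothetical follow-up data: months on treatment and CD4 counts (cells/µL).
months = [0, 6, 12, 18, 24, 30, 36]
cd4 = [210, 290, 340, 390, 410, 450, 460]
curve = [round(loess_point(m, months, cd4, frac=0.6), 1) for m in months]
print(curve)
```

Fitting one such curve per subtype and overlaying them gives the kind of comparison described above; a smaller `frac` follows the data more closely, a larger one smooths more.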

Discussion
Our study examined the correlation between HIV-1 genetic diversity and disease progression using CD4+T lymphocyte counts and longitudinal data obtained from HIV-1 patients over various timeframes. We concurrently sequenced HIV-1 env genes in this cohort and used online tools to predict CXCR4 and CCR5 receptor affinity. Our findings indicate associations between different subtypes and pre-treatment CD4+T lymphocyte counts, CD4+T lymphocyte recovery, and mortality. These results suggest potential influences of various subtypes on the immune status and receptor tropism of HIV-1-infected individuals.
Notably, infection with the CRF01_AE subtype was associated with lower pre-treatment CD4+T lymphocyte counts, likely attributable to its higher CXCR4 receptor ratio. This heightened tropism might contribute to rapid CD4+T lymphocyte decline and accelerated disease progression (Koot et al., 1993). Originating in Central Africa, CRF01_AE is now endemic in Southeast and East Asia. Studies have linked this subtype to a high proportion of CXCR4-tropic virus, hastening AIDS development and immune failure (Kaufmann et al., 2005; Kelley et al., 2009). However, our research indicates that the CXCR4 receptor tropism ratio in the Guangxi population is lower than in other regions (Li et al., 2016; Cui et al., 2019), suggesting that additional factors affect CD4+T lymphocyte counts. Factors such as advanced age, being a peasant, heterosexual infection, long-term infection, and higher viral load significantly affect CD4+T lymphocytes. A previous study in Guangxi (Ge et al., 2019) reported a higher proportion of individuals aged over 50, peasants, and heterosexually infected individuals, contributing to low pre-treatment CD4+T lymphocyte counts, which affect treatment efficacy. Additionally, patients with the CRF07_BC subtype exhibited improved CD4+T lymphocyte recovery, consistent with prior literature (Jiang et al., 2016). We hypothesize that, compared to other subtypes, CRF07_BC infection might delay CD4+T lymphocyte decline and clinical progression in HIV-1 patients, potentially extending survival time and increasing transmission risks. The study also revealed disparities in CD4+T lymphocyte recovery between MSM and heterosexual populations infected with the CRF07_BC subtype, potentially contributing to the subtype's rapid increase in the Guangxi MSM population. One potential hypothesis is that the CRF07_BC subtype may be associated with a less aggressive disease course, allowing for better immune system preservation, especially in populations with early and consistent access to healthcare. The higher pre-treatment CD4+T lymphocyte counts in MSM populations could also be attributed to earlier diagnosis and initiation of ART, reflecting the importance of timely medical intervention in managing HIV infection. In conclusion, the disparities observed in our study highlight the need for further research to understand the underlying mechanisms driving these differences.
Moreover, employing the 11/25 rule, which correlates amino acid residues at positions 11 or 25 of the V3 region with CXCR4 receptor tropism (Sander et al., 2007), we observed relatively low amino acid substitutions in the Guangxi population. Specifically, 11R accounted for 21.1% (4/19) of CXCR4 receptor tropism, primarily within the CRF55_01B subtype. Furthermore, mutations at sites 11 and 25 were found in the CRF07_BC and CRF59_01B subtypes, indicating diverse evolutionary pathways for tropism conversion. Studies from France have highlighted multiple amino acid residue substitutions, including S5Y, N7K, S11R, T12V, T12F, Q18R, I27T, and S32R, in the V3 region of the CRF01_AE subtype. The substitution of arginine (R) for serine (S) at site 11 is pivotal for the transformation of CCR5-tropic strains into CXCR4-tropic strains (Shoombuatong et al., 2012; Hongjaisee et al., 2017). Our study identified mutations such as S11R, I12T, V12T, T19V, A22R, and N29D as key sites leading to CD4+T cell number decline. That these mutations were pivotal for the transformation from CCR5 to CXCR4 tropism is particularly noteworthy. Notably, 12T (46.7%) and 11R (100%) exhibited higher CXCR4 tropism. The CRF01_AE subtype also exhibited a higher proportion of CXCR4-tropic viruses, potentially contributing to a more rapid decline in CD4+T lymphocytes and accelerated disease progression. However, the S11R mutation in our study was mainly concentrated in CRF55_01B, which is why the CRF55_01B subtype displayed a CXCR4 affinity ratio of 14.81%. This indicates that the S11R mutation is not specific to the CRF01_AE subtype, and that the high proportion of CXCR4 tropism in the CRF01_AE subtype in Guangxi is not due to the S11R mutation but may be affected by other mutations, for example I12T, V12T, T19V, 273, A22R, and N29D. Which mutation affects tropism selection in the CRF01_AE subtype, and whether it is a single or a double mutation, needs further study. This suggests that the mutations affecting subtype tropism selection differ between regions. This subtype-specific characteristic could be a significant factor in understanding the disease dynamics in the Guangxi region. However, there is a recognized lack of consensus regarding the substitutions necessary for CXCR4 tropism in this subtype (Hongjaisee et al., 2017). This discrepancy highlights the need for biological evaluation of these variants to confirm coreceptor tropism. Our research sheds light on the relationship between HIV-1 V3 region sequence variation and coreceptor preference in Guangxi's populations, identifying characteristic mutations potentially linked to the geographical environment.
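The 11/25 rule discussed above is simple enough to state as code. The sketch below applies the commonly cited form of the rule, in which a positively charged residue (arginine or lysine) at V3 position 11 or 25 predicts CXCR4 usage. It assumes an ungapped V3 loop numbered from its first cysteine, so insertions or deletions would shift the positions; the function name is hypothetical and the example sequence is a generic 35-residue V3 loop, not study data. It is not the online prediction tool used in the study.

```python
BASIC_RESIDUES = {"R", "K"}  # arginine and lysine carry a positive charge


def predict_tropism_11_25(v3: str) -> str:
    """Return 'X4' if V3 position 11 or 25 is basic, otherwise 'R5'.

    Positions are 1-based and assume an ungapped V3 amino acid sequence.
    """
    v3 = v3.strip().upper()
    if len(v3) < 25:
        raise ValueError("V3 sequence must cover at least positions 1 to 25")
    position_11, position_25 = v3[10], v3[24]
    if position_11 in BASIC_RESIDUES or position_25 in BASIC_RESIDUES:
        return "X4"
    return "R5"


# Generic 35-residue V3 loop with S at position 11 and E at position 25.
example_v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
# The same loop carrying the S11R substitution.
example_s11r = example_v3[:10] + "R" + example_v3[11:]

print(predict_tropism_11_25(example_v3))
print(predict_tropism_11_25(example_s11r))
```

Because the rule inspects only two positions, it cannot account for substitutions elsewhere in V3, such as those at sites 12, 19, 22, and 29 noted in the text, which is one reason phenotypic confirmation is recommended.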
While our study provides valuable insights into the correlation between HIV-1 genetic diversity and disease progression in Guangxi, it is imperative to acknowledge certain limitations. Firstly, our research predominantly focuses on the prevalent subtypes in Guangxi, namely CRF01_AE, CRF07_BC, CRF08_BC, and CRF55_01B. The exclusion of less common subtypes may limit the overall representativeness of our results. Additionally, the collected data may introduce inherent biases and potential inaccuracies in patient histories. For example, women were significantly underrepresented in the sample, reflecting the higher prevalence of HIV-1 among men in Guangxi. This gender disparity may limit the generalizability of our conclusions to the female population. Furthermore, the predictive analysis of receptor tropism, while informative, is based on computational tools and sequencing data, which may not fully capture the in vivo complexity of viral dynamics. Lastly, the study's scope is limited to Guangxi, and caution should be exercised when extrapolating these findings to other regions with distinct epidemiological profiles. Despite these limitations, our research contributes valuable insights to the existing body of knowledge and lays the groundwork for future investigations in this critical area of HIV research.

Conclusion
Our study emphasizes the association of HIV-1 genetic diversity with pre-treatment CD4+T lymphocyte counts, CD4+T lymphocyte recovery, and survival risk in Guangxi. Lower CD4+T lymphocyte counts and slower recovery in CRF01_AE-infected individuals contrast with higher counts and faster recovery in CRF07_BC-infected individuals, potentially contributing to the increased AIDS prevalence and the rapid expansion of CRF07_BC in Guangxi. Multiple influencing factors and distinct mutations further affect CD4+T lymphocyte counts and recovery in HIV-infected individuals in Guangxi.

FIGURE 2 Effects of different subtypes on CD4+T lymphocyte recovery post-treatment. The LOESS model was used to fit post-treatment CD4+T lymphocyte recovery; color indicates the different subtypes.
FIGURE 3 Effects of subtypes on survival risk. (A) The Cox regression model was used to evaluate the influence of various HIV-1 subtypes on survival risk; (B) univariate Cox regression was used to analyze the survival risk across subtypes, incorporating gender, age, marital status, education, occupation, infection route, pre-treatment CD4+T cell counts, pre-treatment viral load, clinical stage, drug resistance, treatment regimen, duration time for initiated treatment and treatment. ns represents P > 0.05.

TABLE 1 Characterization of the study population among various subtypes.
Percentages are shown in parentheses. The chi-square test was used to analyze the difference of DRM frequencies between Han and Zhuang.

TABLE 2
The GEE model was used to analyze factors associated with CD4+T lymphocyte recovery. PDR: pre-treatment drug resistance.